Skip to content

fix double-decode and double-encode in byte stuffing codec#6

Closed
jktjkt wants to merge 1 commit intogertvdijk:developfrom
jktjkt:fix/byte-stuffing
Closed

fix double-decode and double-encode in byte stuffing codec#6
jktjkt wants to merge 1 commit intogertvdijk:developfrom
jktjkt:fix/byte-stuffing

Conversation

@jktjkt
Copy link
Copy Markdown
Contributor

@jktjkt jktjkt commented Apr 11, 2026

Calling the replacement function multiple times is wrong because it looks at partially decoded data which are decoded again. We could fix that by using a different order for encode and decode, but I like this implementation much more because it will also flag unrecognized or truncated stuffing sequences.

Calling the replacement function multiple times is wrong because it
looks at partially decoded data which are decoded again. We could fix
that by using a different order for encode and decode, but I like this
implementation much more because it will also flag unrecognized or
truncated stuffing sequences.
@gertvdijk gertvdijk mentioned this pull request Apr 17, 2026
@gertvdijk
Copy link
Copy Markdown
Owner

Thanks! I like how it fixes an actual bug and catches invalid cases with the exception raised. Also your test cases using bytes.fromhex() are much more readable than mine - I think I will adopt that across all tests actually in a separate change!

What I don't really like is the while-loop that's seem not really Pythonic. I think this is a great use case for a generator function which can be extracted from the encode() / decode() functions while preserving your approach and the bahaviour of it. I think that would improve it a lot.

I've given it a quick refactor and not much is left as-is with these edits, so I'm leaning towards adopting this approach "my way" and proposing that as a PR here for you to review. However, it remains your idea and I'd like it to be reflected in the commit history. Would you prefer to be the author of the commit or a co-author?

Alternatively, if you would like to adopt the idea refactor below in the expandable section and amend your commit with it, that would be great too!

while at it, perhaps a line on what 'stuffing' is (escaping basically) could be added too.

Something like this:

a class variable with the stuffable bytes to avoid duplication:

    _stuffable_bytes: ClassVar[frozenset[int]] = frozenset({
        constants.ByteCode.STUFFING.value,
        constants.ByteCode.ACK.value,
        constants.ByteCode.START_FROM_METER.value,
        constants.ByteCode.START_TO_METER.value,
        constants.ByteCode.STOP.value,
    })
    @classmethod
    def _iter_destuffed_bytes(cls, stuffed: bytes) -> Iterator[int]:
        """Yield destuffed byte values from a stuffed byte sequence."""
        in_stuffing = False

        for raw_byte in stuffed:
            if in_stuffing:
                in_stuffing = False
                xored = raw_byte ^ 0xFF
                if xored not in cls._stuffable_bytes:
                    raise InvalidStuffingByteError(raw_byte=raw_byte)
                yield xored
                continue

            if raw_byte == constants.ByteCode.STUFFING.value:
                in_stuffing = True
                continue

            yield raw_byte

        if in_stuffing:
            raise TruncatedStuffingError

    @classmethod
    def _iter_stuffed_bytes(cls, raw: bytes) -> Iterator[bytes]:
        """Yield stuffed byte chunks from an unescaped byte sequence."""
        stuffing_prefix = constants.ByteCode.STUFFING.value.to_bytes(1, "big")

        for raw_byte in raw:
            if raw_byte in cls._stuffable_bytes:
                yield stuffing_prefix + (raw_byte ^ 0xFF).to_bytes(1, "big")
            else:
                yield raw_byte.to_bytes(1, "big")

[...]

then decoding would be

        data_bytes = frame[1:-1]
        return cast(DataLinkBytes, bytes(self._iter_destuffed_bytes(data_bytes)))

and encoding

        stuffed = b"".join(self._iter_stuffed_bytes(raw))

        frame = (
            self._start_byte.to_bytes(1, "big")
            + stuffed
            + constants.ByteCode.STOP.value.to_bytes(1, "big")
        )
        return cast(PhysicalBytes, frame)

Let me know what you think.

@gertvdijk
Copy link
Copy Markdown
Owner

Anyway, I've refactored it to my liking and I'd like you to review this @jktjkt: #7.

Thanks again, and please LMK what you think.

@jktjkt
Copy link
Copy Markdown
Contributor Author

jktjkt commented Apr 19, 2026

I agree that a generator is a nice improvement, thanks. A Co-authored-by in either direction is perfectly OK with me. Since you wrote the actual implementation, I think that it's only fair to list yourself as the primary author, and me in the commit footer, but I don't really care too much.

I'll read the updated PR and test this tomorrow (we have some heat meters which triggered the double decoding during an actual log readout). Let's close this one in favor of #7.

@jktjkt jktjkt closed this Apr 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants